
A model of the hippocampus combining self-organization and associative memory function

Neural Information Processing Systems

A model of the hippocampus is presented which forms rapid self-organized representations of input arriving via the perforant path, performs recall of previous associations in region CA3, and performs comparison of this recall with afferent input in region CA1. This comparison drives feedback regulation of cholinergic modulation to set appropriate dynamics for learning of new representations in regions CA3 and CA1. The network responds to novel patterns with increased cholinergic modulation, allowing storage of new self-organized representations, but responds to familiar patterns with a decrease in acetylcholine, allowing recall based on previous representations. This requires selectivity of the cholinergic suppression of synaptic transmission in stratum radiatum of regions CA3 and CA1, which has been demonstrated experimentally.
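The novelty-driven switch between storage and recall can be caricatured as a comparison followed by thresholded modulation. The following is a minimal sketch; the binary patterns, ACh levels, and the 0.8 match threshold are illustrative choices, not parameters from the model:

```python
import numpy as np

rng = np.random.default_rng(0)

# One previously stored binary pattern (a stand-in for a CA3 association).
stored = rng.random(20) > 0.5

def ach_level(afferent, recalled, threshold=0.8):
    """High ACh (store new pattern) for novel input, low ACh (recall) for familiar."""
    match = np.mean(afferent == recalled)
    return 0.1 if match >= threshold else 0.9

familiar = stored.copy()   # matches the recalled association -> low ACh
novel = ~stored            # maximally mismatched input -> high ACh
```

The comparison step stands in for the CA1 match between afferent input and CA3 recall; the returned level would gate learning versus recall dynamics downstream.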


Relational VAE: A Continuous Latent Variable Model for Graph Structured Data

Mylonas, Charilaos, Abdallah, Imad, Chatzi, Eleni

arXiv.org Machine Learning

Graph Networks (GNs) enable the fusion of prior knowledge and relational reasoning with flexible function approximations. In this work, a general GN-based model is proposed which takes full advantage of the relational modeling capabilities of GNs and extends these to probabilistic modeling with Variational Bayes (VB). To that end, we combine complementary pre-existing approaches on VB for graph data and propose an approach that relies on graph-structured latent and conditioning variables. It is demonstrated that Neural Processes can also be viewed through the lens of the proposed model. We show applications on the problem of structured probability density modeling for simulated and real wind farm monitoring data, as well as on the meta-learning of simulated Gaussian Process data. We release the source code, along with the simulated datasets.
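The Variational Bayes ingredient that such graph models build on can be illustrated with the standard reparameterisation trick applied per node. The node count, dimensions, and linear "encoder" below are made-up stand-ins, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical graph: 6 nodes, each with an 8-dimensional feature vector.
node_features = rng.standard_normal((6, 8))

# Stand-in "encoder": linear maps to a per-node Gaussian latent (dim 4).
W_mu = rng.standard_normal((8, 4)) * 0.1
W_logvar = rng.standard_normal((8, 4)) * 0.1

mu = node_features @ W_mu
logvar = node_features @ W_logvar
eps = rng.standard_normal(mu.shape)
z = mu + np.exp(0.5 * logvar) * eps   # reparameterised latent sample

# KL divergence of the per-node posterior from a standard normal prior,
# the regulariser in the variational objective.
kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)
```

Sampling through `eps` keeps the latent draw differentiable with respect to `mu` and `logvar`, which is what makes the variational objective trainable by gradient descent.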


Partially Observable Online Change Detection via Smooth-Sparse Decomposition

Guo, Jie, Yan, Hao, Zhang, Chen, Hoi, Steven

arXiv.org Machine Learning

We consider online change detection of high dimensional data streams with sparse changes, where only a subset of data streams can be observed at each sensing time point due to limited sensing capacities. On the one hand, the detection scheme should be able to deal with partially observable data and meanwhile have efficient detection power for sparse changes. On the other hand, the scheme should be able to adaptively and actively select the most important variables to observe to maximize the detection power. To address these two points, in this paper, we propose a novel detection scheme called CDSSD. In particular, it describes the structure of high dimensional data with sparse changes by smooth-sparse decomposition, whose parameters can be learned via spike-slab variational Bayesian inference. Then the posterior Bayes factor, which incorporates the learned parameters and sparse change information, is formulated as a detection statistic. Finally, by formulating the statistic as the reward of a combinatorial multi-armed bandit problem, an adaptive sampling strategy based on Thompson sampling is proposed. The efficacy and applicability of our method in practice are demonstrated with numerical studies and a real case study.
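The adaptive sampling step can be sketched as a standard Thompson sampling loop for a Bernoulli bandit; the arm reward probabilities below are synthetic stand-ins for the paper's posterior-Bayes-factor rewards:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 5 sensing locations, with the change signal on arm 2.
true_means = np.array([0.1, 0.1, 0.8, 0.1, 0.1])
n_arms = len(true_means)

# Beta(1, 1) priors over each arm's reward probability.
alpha = np.ones(n_arms)
beta = np.ones(n_arms)

for _ in range(500):
    # Thompson sampling: draw one sample per posterior, observe the argmax arm.
    theta = rng.beta(alpha, beta)
    arm = int(np.argmax(theta))
    reward = rng.random() < true_means[arm]
    alpha[arm] += reward
    beta[arm] += 1 - reward

# After sampling, the posterior mean concentrates on the informative arm.
best = int(np.argmax(alpha / (alpha + beta)))
```

The loop naturally balances exploring rarely observed streams against exploiting the stream whose posterior currently suggests a change, which is the trade-off the adaptive observation scheme has to resolve.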


Acceleration of Descent-based Optimization Algorithms via Carath\'eodory's Theorem

Cosentino, Francesco, Oberhauser, Harald, Abate, Alessandro

arXiv.org Machine Learning

We propose a new technique to accelerate algorithms based on Gradient Descent using Carath\'eodory's Theorem. In the case of the standard Gradient Descent algorithm, we analyse the theoretical convergence of the approach under convexity assumptions and empirically demonstrate its improvements. As a core contribution, we then present an application of the acceleration technique to Block Coordinate Descent methods. Experimental comparisons on least squares regression with a LASSO regularisation term show markedly improved performance on LASSO compared with the ADAM and SAG algorithms.
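The baseline such accelerations wrap, proximal gradient descent (ISTA) on a LASSO objective, can be sketched as follows; the data and regularisation weight are synthetic, and the Carath\'eodory subsampling itself is not shown:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy LASSO: minimise 0.5 * ||Ax - b||^2 + lam * ||x||_1.
A = rng.standard_normal((50, 10))
x_true = np.zeros(10)
x_true[:3] = [2.0, -1.5, 1.0]        # sparse ground truth
b = A @ x_true
lam = 0.1

L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the smooth gradient
x = np.zeros(10)
for _ in range(2000):
    g = A.T @ (A @ x - b)            # gradient of the least-squares part
    z = x - g / L                    # gradient step
    # Soft-thresholding: the proximal operator of the L1 term.
    x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
```

The expensive part is the full-data gradient `A.T @ (A @ x - b)` at every iteration, which is the term a measure-reduction result like Carath\'eodory's Theorem targets by replacing the dataset with a small weighted subset.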


Eigendecomposition of Q in Equally Constrained Quadratic Programming

Yu, Shi

arXiv.org Machine Learning

When applying eigenvalue decomposition to the quadratic term matrix in a type of linear equality constrained quadratic programming (EQP), there exists a linear mapping that projects optimal solutions between the new EQP formulation, where $Q$ is diagonalized, and the original formulation. Although such a mapping requires a particular type of equality constraints, it generalizes to some real problems such as the efficient frontier for portfolio allocation and the classification of Least Squares Support Vector Machines (LSSVM). The established mapping is potentially useful for exploring optimal solutions in a subspace, though its full implications are not yet clear to the author. This work was inspired by a similar result proved for the unconstrained formulation discussed earlier in \cite{Tan}, but the present proof is substantially improved and generalized. To the author's knowledge, little comparable discussion appears in the literature.
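The flavour of the mapping can be seen in the unconstrained case, where diagonalising $Q$ rotates the problem and the optima map back through the eigenvector matrix; this sketch omits the equality constraints that the paper actually treats:

```python
import numpy as np

rng = np.random.default_rng(0)

# Symmetric positive definite quadratic term and a linear term.
M = rng.standard_normal((4, 4))
Q = M @ M.T + 4 * np.eye(4)
c = rng.standard_normal(4)

# Original problem: min 0.5 x'Qx + c'x  =>  x* = -Q^{-1} c.
x_star = -np.linalg.solve(Q, c)

# Diagonalised problem via Q = V diag(lam) V': in coordinates y = V'x the
# quadratic term is diagonal and the linear term becomes V'c, so the
# per-coordinate optimum is y*_i = -(V'c)_i / lam_i.
lam, V = np.linalg.eigh(Q)
y_star = -(V.T @ c) / lam

# The linear mapping back to the original formulation is x* = V y*.
```

With a diagonal quadratic term the optimisation decouples into independent scalar problems, which is what makes the diagonalised formulation attractive when the mapping between the two solution sets is available.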


Can Embeddings Adequately Represent Medical Terminology? New Large-Scale Medical Term Similarity Datasets Have the Answer!

Schulz, Claudia, Juric, Damir

arXiv.org Artificial Intelligence

A large number of embeddings trained on medical data have emerged, but it remains unclear how well they represent medical terminology, in particular whether the close relationship of semantically similar medical terms is encoded in these embeddings. To date, only small datasets for testing medical term similarity are available, which does not allow conclusions to be drawn about the generalisability of embeddings to the enormous number of medical terms used by doctors. We present multiple automatically created large-scale medical term similarity datasets and confirm their high quality in an annotation study with doctors. We evaluate state-of-the-art word and contextual embeddings on our new datasets, comparing multiple vector similarity metrics and word vector aggregation techniques. Our results show that current embeddings are limited in their ability to adequately encode medical terms. The novel datasets thus form a challenging new benchmark for the development of medical embeddings able to accurately represent the whole medical terminology.
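The evaluation relies on vector similarity between term embeddings; a minimal sketch using cosine similarity, with random stand-in vectors rather than trained medical embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in vectors, constructed so that one term pair is near-parallel;
# a real evaluation would load trained medical embeddings instead.
base = rng.standard_normal(50)
emb = {
    "myocardial infarction": base,
    "heart attack": base + 0.05 * rng.standard_normal(50),
    "appendicitis": rng.standard_normal(50),
}

def cosine(u, v):
    """Cosine similarity, the most common vector similarity metric."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

sim_close = cosine(emb["myocardial infarction"], emb["heart attack"])
sim_far = cosine(emb["myocardial infarction"], emb["appendicitis"])
```

A term-similarity benchmark then checks whether such scores rank synonymous pairs (e.g. "myocardial infarction" / "heart attack") above unrelated ones, across many term pairs.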


Hybrid Batch Bayesian Optimization

Azimi, Javad, Jalali, Ali, Fern, Xiaoli

arXiv.org Artificial Intelligence

Bayesian Optimization aims at optimizing an unknown non-convex/concave function that is costly to evaluate. We are interested in application scenarios where concurrent function evaluations are possible. Under such a setting, BO could choose to either sequentially evaluate the function, one input at a time, waiting for the output before making the next selection, or evaluate the function at a batch of multiple inputs at once. These two different settings are commonly referred to as the sequential and batch settings of Bayesian Optimization. In general, the sequential setting leads to better optimization performance as each function evaluation is selected with more information, whereas the batch setting has an advantage in terms of the total experimental time (the number of iterations). In this work, our goal is to combine the strength of both settings. Specifically, we systematically analyze Bayesian optimization using a Gaussian process as the posterior estimator and provide a hybrid algorithm that, based on the current state, dynamically switches between a sequential policy and a batch policy with variable batch sizes. We provide theoretical justification for our algorithm and present experimental results on eight benchmark BO problems. The results show that our method achieves substantial speedup (up to 78%) compared to a pure sequential policy, without suffering any significant performance loss.
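A minimal caricature of such a state-dependent switch, using Gaussian process posterior variance on a 1-D toy problem; the 0.5 threshold, the RBF kernel, and the greedy max-variance batch rule are illustrative choices, not the paper's policy:

```python
import numpy as np

def rbf(a, b, ls=0.2):
    """RBF kernel matrix between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

grid = np.linspace(0.0, 1.0, 101)   # candidate inputs
X = np.array([0.1, 0.9])            # points evaluated so far
noise = 1e-6

def posterior_var(X, grid):
    """GP posterior variance on the grid (prior variance 1)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    k = rbf(grid, X)
    return 1.0 - np.einsum('ij,jk,ik->i', k, np.linalg.inv(K), k)

var = posterior_var(X, grid)
if var.max() > 0.5:
    # High uncertainty: spend a batch of evaluations (greedy max-variance).
    batch = []
    for _ in range(3):
        batch.append(grid[np.argmax(posterior_var(X, grid))])
        X = np.append(X, batch[-1])
else:
    # Low uncertainty: pick a single point sequentially.
    X = np.append(X, grid[np.argmax(var)])
```

The intuition mirrors the abstract: when the posterior is highly uncertain, the extra information per evaluation is low anyway, so batching costs little; once the model is informed, each sequential choice benefits from the previous outcome.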